How Can I Index My Thousands of Photos Effectively and Automatically? An Unsupervised Feature Selection Approach

نویسندگان

  • Juhua Hu
  • Jian Pei
  • Jie Tang
چکیده

Given a large photo collection without domain knowledge (e.g., tourism photos, conference photos, event photos, images wrapped from webpages), it is not easy for human beings to organize or only view them within a reasonable time. In this paper, we propose to automatically extract meaningful semantics from a photo collection named “dimensions” to help people view, search and organize photos conveniently and efficiently. However, due to the lack of additional domain knowledge or content information, existing image retrieval techniques are not applicable. To tackle the problem, we first propose a simple strategy to extract all meaningful semantics from original photos/images as candidate dimensions, and then propose an efficient unsupervised feature/dimension selection method to select a sufficient dimension subset to uniquely index each photo within this collection. Our experiments on several real-world photo/image collections validate both the efficiency and effectiveness of our proposed method.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Creating Capsule Wardrobes from Fashion Images

We propose to automatically create capsule wardrobes. Given an inventory of candidate garments and accessories, the algorithm must assemble a minimal set of items that provides maximal mix-and-match outfits. We pose the task as a subset selection problem. To permit efficient subset selection over the space of all outfit combinations, we develop submodular objective functions capturing the key i...

متن کامل

Automatically Redundant Features Removal for Unsupervised Feature Selection via Sparse Feature Graph

The redundant features existing in high dimensional datasets always affect the performance of learning and mining algorithms. How to detect and remove them is an important research topic in machine learning and data mining research. In this paper, we propose a graph based approach to find and remove those redundant features automatically for high dimensional data. Based on the framework of spar...

متن کامل

Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine

We can reach by DNA microarray gene expression to such wealth of information with thousands of variables (genes). Analysis of this information can show genetic reasons of disease and tumor differences. In this study we try to reduce high-dimensional data by statistical method to select valuable genes with high impact as biomarkers and then classify ovarian tumor based on gene expression data of...

متن کامل

Reconstruction-based Unsupervised Feature Selection: An Embedded Approach

Feature selection has been proven to be effective and efficient in preparing high-dimensional data for data mining and machine learning problems. Since real-world data is usually unlabeled, unsupervised feature selection has received increasing attention in recent years. Without label information, unsupervised feature selection needs alternative criteria to define feature relevance. Recently, d...

متن کامل

A Clustering-Based Approach to Reduce Feature Redundancy

Research effort has recently focused on designing feature weighting clustering algorithms. These algorithms automatically calculate the weight of each feature, representing their degree of relevance, in a data set. However, since most of these evaluate one feature at a time they may have difficulties to cluster data sets containing features with similar information. If a group of features conta...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014